AITopics | anomalous sample

anomalous sample

Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

Stephan Rabanser, Stephan Günnemann, Zachary Lipton

Neural Information Processing Systems, Feb-12-2026, 19:21:17 GMT

This paper explores the problem of building ML systems that fail loudly, investigating methods for detecting dataset shift, identifying exemplars that most typify the shift, and quantifying shift malignancy. We focus on several datasets and various perturbations to both covariates and label distributions with varying magnitudes and fractions of data affected. Interestingly, we show that across the dataset shifts that we explore, a two-sample-testing-based approach, using pre-trained classifiers for dimensionality reduction, performs best.
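
One way to picture the two-sample-testing approach described in this abstract is the minimal Python sketch below. It assumes the detector receives low-dimensional representations (for example, softmax outputs of a pre-trained classifier) for a source sample and a target sample, runs a Kolmogorov-Smirnov test on each dimension, and applies a Bonferroni correction. The choice of test, the correction, the asymptotic p-value approximation, and the names `ks_statistic`, `ks_pvalue`, and `detect_shift` are illustrative assumptions, not details taken from the abstract.

```python
import math


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two 1-D samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sided p-value for a two-sample KS statistic."""
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        # The Kolmogorov series is numerically 1 in this range.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * total))


def detect_shift(source, target, alpha=0.05):
    """Flag a shift if any dimension rejects at the Bonferroni-corrected level.

    `source` and `target` are lists of equal-width rows holding the reduced
    representations (an assumption of this sketch).
    """
    n_dims = len(source[0])
    p_values = []
    for k in range(n_dims):
        a = [row[k] for row in source]
        b = [row[k] for row in target]
        p_values.append(ks_pvalue(ks_statistic(a, b), len(a), len(b)))
    return min(p_values) < alpha / n_dims, p_values
```

Calling `detect_shift(source_reps, target_reps)` returns a boolean flag together with the per-dimension p-values, so the dimensions that drive a detection can be inspected afterwards.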

classifier, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > UAE (0.06)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

DeNoise: Learning Robust Graph Representations for Unsupervised Graph-Level Anomaly Detection

Chen, Qingfeng, Zeng, Haojin, Jie, Jingyi, Zhang, Shichao, Cheng, Debo

arXiv.org Artificial Intelligence, Nov-7-2025

With the rapid growth of graph-structured data in critical domains, unsupervised graph-level anomaly detection (UGAD) has become a pivotal task. UGAD seeks to identify entire graphs that deviate from normal behavioral patterns. However, most Graph Neural Network (GNN) approaches implicitly assume that the training set is clean, containing only normal graphs, which is rarely true in practice. Even modest contamination by anomalous graphs can distort learned representations and sharply degrade performance. To address this challenge, we propose DeNoise, a robust UGAD framework explicitly designed for contaminated training data. It jointly optimizes a graph-level encoder, an attribute decoder, and a structure decoder via an adversarial objective to learn noise-resistant embeddings. Further, DeNoise introduces an encoder anchor-alignment denoising mechanism that fuses high-information node embeddings from normal graphs into all graph embeddings, improving representation quality while suppressing anomaly interference. A contrastive learning component then compacts normal graph embeddings and repels anomalous ones in the latent space. Extensive experiments on eight real-world datasets demonstrate that DeNoise consistently learns reliable graph-level representations under varying noise intensities and significantly outperforms state-of-the-art UGAD baselines.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.04086

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning

Aqeel, Muhammad, Sharifi, Shakiba, Cristani, Marco, Setti, Francesco

arXiv.org Artificial Intelligence, Oct-29-2025

So-called unsupervised anomaly detection is better described as semi-supervised, as it assumes all training data are nominal. This assumption simplifies training but requires manual data curation, introducing bias and limiting adaptability. We propose Confident Meta-learning (CoMet), a novel training strategy that enables deep anomaly detection models to learn from uncurated datasets where nominal and anomalous samples coexist, eliminating the need for explicit filtering. Our approach integrates Soft Confident Learning, which assigns lower weights to low-confidence samples, and Meta-Learning, which stabilizes training by regularizing updates based on training-validation loss covariance. This prevents overfitting and enhances robustness to noisy data. CoMet is model-agnostic and can be applied to any anomaly detection method trainable via gradient descent. Experiments on MVTec-AD, VIADUCT, and KSDD2 with two state-of-the-art models demonstrate the effectiveness of our approach, consistently improving over the baseline methods, remaining insensitive to anomalies in the training set, and setting a new state-of-the-art across all datasets.

artificial intelligence, data mining, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2508.02293

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

A Novel GPT-Based Framework for Anomaly Detection in System Logs

Zhang, Zeng, Yin, Wenjie, Li, Xiaoqi

arXiv.org Artificial Intelligence, Oct-21-2025

Identification of anomalous events within system logs constitutes a pivotal element within the framework of cybersecurity defense strategies. However, this process faces numerous challenges, including the management of substantial data volumes, the distribution of anomalies, and the precision of conventional methods. To address this issue, the present paper proposes an intelligent detection method for system logs based on Generative Pre-trained Transformers (GPT). The efficacy of this approach is attributable to a combination of structured input design and a Focal Loss optimization strategy, which together substantially improve the performance of log anomaly detection. First, raw logs are converted into event ID sequences using the Drain parser. Subsequently, the Focal Loss function is employed to address the issue of class imbalance. The experimental results demonstrate that the optimized GPT-2 model significantly outperforms the unoptimized model in a range of key metrics, including precision, recall, and F1 score. In specific tasks, it achieves performance comparable or superior to that of the GPT-3.5 API.
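
To illustrate how a focal loss addresses the class imbalance mentioned in this abstract, here is a minimal Python sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The abstract does not state the hyperparameters or the exact variant used with GPT-2, so the defaults `gamma=2.0` and `alpha=0.25` and the function name `focal_loss` are assumptions made only for this example.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive (anomalous) class.
    y: true label, 1 for anomalous and 0 for normal.
    gamma, alpha: assumed hyperparameters, not taken from the abstract.
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of predictions."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy. Larger `gamma` values shift the training signal toward hard, misclassified examples, which in an imbalanced log dataset tend to be the rare anomalous events.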

arxiv preprint arxiv, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.16044

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Root Cause Analysis of Outliers in Unknown Cyclic Graphs

Schkoda, Daniela, Janzing, Dominik

arXiv.org Machine Learning, Oct-9-2025

We study the propagation of outliers in cyclic causal graphs with linear structural equations, tracing them back to one or several "root cause" nodes. We show that it is possible to identify a short list of potential root causes provided that the perturbation is sufficiently strong and propagates according to the same structural equations as in the normal mode. This shortlist consists of the true root causes together with those of its parents lying on a cycle with the root cause. Notably, our method does not require prior knowledge of the causal graph.
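
The setting in this abstract can be made concrete with a small Python sketch of how a perturbation propagates through a cyclic graph with linear structural equations. In a linear model x = A x + n, the equilibrium is x = (I - A)^(-1) n whenever I - A is invertible, so a shock injected at one node spreads to its descendants, including around cycles. The example graph, the edge weights, the convention that `A[i][j]` is the weight of the edge from node j to node i, and the helper names are all hypothetical; the sketch shows only the propagation, not the paper's root-cause identification method.

```python
def solve_linear(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("I - A is singular; no unique equilibrium")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def equilibrium(A, noise):
    """Equilibrium x of the linear cyclic model x = A x + noise."""
    n = len(noise)
    i_minus_a = [
        [(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(i_minus_a, noise)


# Hypothetical graph: 0 -> 1, and a cycle 1 -> 2 -> 1.
A = [
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.5],
    [0.0, 0.4, 0.0],
]

# A shock of size 10 at node 0 (the root cause) and no other noise.
shift = equilibrium(A, [10.0, 0.0, 0.0])
print(shift)
```

In this toy graph the shock at node 0 shifts all three nodes, even though only node 0 was perturbed, which is why the observed outlier pattern alone does not immediately reveal the root cause.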

matrix, node, proceedings, (13 more...)

arXiv.org Machine Learning

2510.06995

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Anomalous Samples for Few-Shot Anomaly Detection

Abdali, Aymane, Boguslawski, Bartosz, Drumetz, Lucas, Gripon, Vincent

arXiv.org Artificial Intelligence, Aug-1-2025

Several anomaly detection and classification methods rely on large amounts of non-anomalous or "normal" samples under the assumption that anomalous data is typically harder to acquire. This hypothesis becomes questionable in Few-Shot settings, where as little as one annotated sample can make a significant difference. In this paper, we tackle the question of utilizing anomalous samples in training a model for binary anomaly classification. We propose a methodology that incorporates anomalous samples in a multi-score anomaly detection score leveraging recent Zero-Shot and memory-based techniques. We compare the utility of anomalous samples to that of regular samples and study the benefits and limitations of each. In addition, we propose an augmentation-based validation technique to optimize the aggregation of the different anomaly scores and demonstrate its effectiveness on popular industrial anomaly detection datasets.

artificial intelligence, data mining, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2507.23712

Country: Europe > France (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

MLASDO: a software tool to detect and explain clinical and omics inconsistencies applied to the Parkinson's Progression Markers Initiative cohort

Pardo, José A., Bernal, Tomás, Ñiguez, Jaime, Gil-Martínez, Ana Luisa, Ibañez, Laura, Palma, José T., Botía, Juan A., Gómez-Pascual, Alicia

arXiv.org Artificial Intelligence, Jul-8-2025

Inconsistencies between clinical and omics data may arise within medical cohorts. The identification, annotation and explanation of anomalous omics-based patients or individuals may become crucial to better reshape the disease, e.g., by detecting early onsets signaled by the omics and undetectable from observable symptoms. Here, we developed MLASDO (Machine Learning based Anomalous Sample Detection on Omics), a new method and software tool to identify, characterize and automatically describe anomalous samples based on omics data. Its workflow is based on three steps: (1) classification of healthy and cases individuals using a support vector machine algorithm; (2) detection of anomalous samples within groups; (3) explanation of anomalous individuals based on clinical data and expert knowledge. We showcase MLASDO using transcriptomics data of 317 healthy controls (HC) and 465 Parkinson's disease (PD) cases from the Parkinson's Progression Markers Initiative. In this cohort, MLASDO detected 15 anomalous HC with a PD-like transcriptomic signature and PD-like clinical features, including a lower proportion of CD4/CD8 naive T-cells and CD4 memory T-cells compared to HC (P<3.5*10^-3). MLASDO also identified 22 anomalous PD cases with a transcriptomic signature more similar to that of HC and some clinical features more similar to HC, including a lower proportion of mature neutrophils compared to PD cases (P<6*10^-3). In summary, MLASDO is a powerful tool that can help the clinician to detect and explain anomalous HC and cases of interest to be followed up. MLASDO is an open-source R package available at: https://github.com/JoseAdrian3/MLASDO.

artificial intelligence, covariate, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.03656

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Spain (0.04)

Genre:

Workflow (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)

A New Spatiotemporal Correlation Anomaly Detection Method that Integrates Contrastive Learning and Few-Shot Learning in Wireless Sensor Networks

Ye, Miao, Wang, Suxiao, Han, Jiaguang, Wang, Yong, Wang, Xiaoli, Wei, Jingxuan, Wen, Peng, Cui, Jing

arXiv.org Artificial Intelligence, Jun-3-2025

Detecting anomalies in the data collected by WSNs can provide crucial evidence for assessing the reliability and stability of WSNs. Existing methods for WSN anomaly detection often face challenges such as the limited extraction of spatiotemporal correlation features, the absence of sample labels, few anomaly samples, and an imbalanced sample distribution. To address these issues, a spatiotemporal correlation detection model (MTAD-RD) considering both model architecture and a two-stage training strategy perspective is proposed. In terms of model structure design, the proposed MTAD-RD backbone network includes a retentive network (RetNet) enhanced by a cross-retention (CR) module, a multigranular feature fusion module, and a graph attention network module to extract internode correlation information. This proposed model can integrate the intermodal correlation features and spatial features of WSN neighbor nodes while extracting global information from time series data. Moreover, its serialized inference characteristic can remarkably reduce inference overhead. For model training, a two-stage training approach was designed. First, a contrastive learning proxy task was designed for time series data with graph structure information in WSNs, enabling the backbone network to learn transferable features from unlabeled data using unsupervised contrastive learning methods, thereby addressing the issue of missing sample labels in the dataset. Then, a caching-based sample sampler was designed to divide samples into few-shot and contrastive learning data. A specific joint loss function was developed to jointly train the dual-graph discriminator network to address the problem of sample imbalance effectively. In experiments carried out on real public datasets, the designed MTAD-RD anomaly detection method achieved an F1 score of 90.97%, outperforming existing supervised WSN anomaly detection methods.

data mining, machine learning, node, (18 more...)

arXiv.org Artificial Intelligence

2506.0042

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Strengthening Anomaly Awareness

Banda, Adam, Khosa, Charanjit K., Sanz, Veronica

arXiv.org Artificial Intelligence, Apr-17-2025

We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection. Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy: the model is first trained in an unsupervised manner on background data, and then fine-tuned using a small sample of labeled anomalies to encourage larger reconstruction errors for anomalous samples. We validate the method across diverse domains, including the MNIST dataset with synthetic anomalies, network intrusion data from the CICIDS benchmark, collider physics data from the LHCO2020 dataset, and simulated events from the Standard Model Effective Field Theory (SMEFT). The latter provides a realistic example of subtle kinematic deviations in Higgs boson production. In all cases, the model demonstrates improved sensitivity to unseen anomalies, achieving better separation between normal and anomalous samples. These results indicate that even limited anomaly information, when incorporated through targeted fine-tuning, can substantially improve the generalization and performance of unsupervised models for anomaly detection.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.1152

Country: Europe (0.46)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Unsupervised Anomaly Detection through Mass Repulsing Optimal Transport

Montesuma, Eduardo Fernandes, Habazi, Adel El, Mboula, Fred Ngole

arXiv.org Machine Learning, Feb-18-2025

An anomaly, or an outlier, is a data point that is significantly different from the remaining data [Aggarwal, 2017], to such an extent that it was likely generated by a different mechanism [Hawkins, 1980]. From the perspective of machine learning, Anomaly Detection (AD) wants to determine, from a set of examples, which ones are likely anomalies, typically through a score. This problem finds applications in many different fields, such as medicine Salem et al. [2013], cyber-security Siddiqui et al. [2019], and system monitoring Isermann [2006], to name a few. As reviewed in Han et al. [2022], existing techniques for AD are usually divided into unsupervised, semi-supervised and supervised approaches, with an increasing need for labeled data. In this paper, we focus on unsupervised AD, which does not need further labeling effort in constituting datasets. As discussed in Livernoche et al. [2024], the growing number of applications involving high-dimensional and complex data begs the need for non-parametric algorithms.

data mining, machine learning, mass repulsing optimal transport, (13 more...)

arXiv.org Machine Learning

2502.12793

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
